GINS4 might be a novel prognostic immune-related biomarker of not only esophageal squamous cell carcinoma and other cancers

Background Immunotherapy using immune checkpoint inhibitors (ICIs), such as antibody of programmed death-1 (PD-1)/programmed death-ligand 1 (PD-L1) has showed as a promising treatment for esophageal squamous cell carcinoma (ESCC), but resistance is unavoidable. This study aimed to find more immune-related genes to promote the efficiency of immunotherapy. Materials and methods Three datasets were downloaded from Gene Expression Omnibus (GEO) database. Gene differential analysis was performed to identify differentially expressed genes (DEGs), then ceRNA network was constructed based on differentially expressed lncRNAs and mRNAs. Next, Functional enrichment analysis and protein–protein interaction (PPI) network were built to reveal the potential function of mRNAs in ceRNA network. Survival analysis and immune cell infiltration level analysis were utilized to identify prognostic immune-related genes. Finally, pan-cancer analysis was performed to show the role of immune-related genes in other cancers. Results The data of 215 samples in total were obtained from GEO database (98 normal tissues and 117 tumor tissues), and 1685 differentially expressed mRNAs (176 downregulated and 1509 upregulated) and 3 upregulated lncRNAs (MCM3AP-AS1, HCP5 and GUSBP11, all upregulated) were found. ceRNA network was constructed to reveal some special correlation. Function enrichment showed some potential functions of mRNAs in ceRNA network such as mitotic cell cycle process, negative regulation of DNA-binding transcription factor activity, ossification, VEGFA-VEGFR2 signaling pathway, epithelial to mesenchymal transition, embryonic morphogenesis and so on. PPI network showed the physical interactions between each mRNA in ceRNA network. Through survival analysis and immune cell infiltration level analysis, GINS4 was confirmed as an immune-related prognostic gene in ESCC. GSEA showed some potential functions such as negative regulation of monocyte chemotaxis, antigen processing and presentation of endogenous peptide antigen via MHC class I via ER pathway, positive regulation of antigen processing and presentation, dendritic cell antigen processing and presentation and so on. Finally, pan-cancer analysis revealed that GINS4 might be a novel immune-related prognostic gene in ESCC and other cancers. Conclusion Our study suggested that GINS4 was correlated with prognosis and immune cell infiltration level of ESCC and other cancers. It may deserve further investigation as a potential immune-related prognostic biomarker of ESCC and other cancers.


Introduction
Esophageal cancer (EC) is one of the most common malignant tumors with a high incidence and mortality in the world [1]. It consists of two major histological types, esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). ESCC is more commonly seen than EAC in the eastern countries [2]. Surgery or surgery-based multimodality therapy is currently the most effective treatment for ESCC [3]. For locally advanced ESCC, neoadjuvant chemotherapy combined with radiotherapy or immunotherapy is now the mainstay for improvement of survival [4][5][6].
Though great progress has been made in the biological functions of driver gene for esophageal cancer, there are still lack of effective targeted drugs for ESCC. In recent years, immune checkpoint inhibitors (ICIs), such as the antibody of programmed death-1 (PD-1)/programmed death-ligand 1 (PD-L1) has showed its promising efficiency as neoadjuvant or adjuvant therapy, especially when combined with chemotherapy. Studies have shown that PD-1/PD-L1 inhibitors such as nivolumab, pembrolizumab can significantly improve the objective response ratio (ORR) with a good safety and prolong over survival (OS) in both first-line and second-line treatment for advanced ESCC [7][8][9][10]. In fact, according to some phase I-III trials, though ICIs have showed its effectiveness, only a subset of patients would benefit from them [11]. For example, the overall response rate (ORR) was 30% in in KEYNOTE-028 trial [9] and 9.9% in KEYNOTE-180 trial [12]. What's more, ATT RAC TION-3 trial, a multicenter, randomized phase III trial, showed that the ORR was only 19% [13]. Therefore, it's urgent to explore more immunotherapy approaches to benefit more patients.
Studies have shown that tumor immune microenvironment (TIME) including tumor-infiltrating lymphocytes (TIL), tumor-associated macrophages (TAM), and myeloid-derived suppressor cells (MDSC) may contribute to the resistance to immunotherapy, but the mechanisms are still unknown [14][15][16][17]. Therefore, immune-related genes and immunotherapy-related biomarkers are urgently needed, which may help to elucidate the molecular mechanisms of TIME on tumorigenesis, and also improve the efficiency of immunotherapy. In this study, we aimed to search for novel immune-related genes through integrated analysis on the data downloaded from Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA) database.

Gene sets acquisition
The data of ESCC datasets GSE33426 (normal: 12, tumor: 59), GSE38129 (normal: 30, tumor: 30) and GSE161533 (normal: 56, tumor: 28) were downloaded from GEO database. Then the three datasets were merged into one new dataset, and package "sva" in software R (ver. 4.1.0) was utilized to remove the batch effect. Next, the merged dataset was divided the into mRNA group and lncRNA group according to the GENCODE project (http:// www. genco degen es. org).

Gene differential analysis
In order to find out the differentially expressed genes (DEGs), the gene differential analysis was performed between tumor and normal tissues on mRNA group and lncRNA group by "limma" package. |LogFC|> 1 and adjusted p value < 0.05 were considered statistically significant for the DEGs.

Functional enrichment analysis and protein-protein interaction network
In order to reveal the potential function of mRNA in ceRNA network, we uploaded the mRNAs in ceRNA network to the website Metascape (https:// metas cape. org/ gp) [22]. In this website, we could perform function enrichment analysis including gene ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and protein-protein interaction (PPI) enrichment analysis.

Survival analyses
Due to lack of clinicopathological features in GEO datasets, we downloaded ESCC gene expression profile and corresponding clinicopathological features from The Cancer Genome Atlas (TCGA) (https:// portal. gdc. cancer. gov/). Then survival analyses were carried out based on the lncRNAs and mRNAs in ceRNA network. Kaplan-Meier (KM) survival curves were constructed by "survival" and "survminer" packages. Log-rank p < 0.05 indicated a significance difference.

Immune cell infiltration level
To explore whether the prognostic mRNAs were correlated with TIME, we uploaded tumor expression profile of GEO dataset to Tumor IMmune Estimation Resource (TIMER) database (https:// cistr ome. shiny apps. io/ timer/) to calculate six immune cell infiltration level (B cell, T cell CD4+, T cell CD8+, Neutrophil, Macrophage and Myeloid dendritic cell) [23]. Spearman analyses were performed to reveal the correlation between the expression level of prognostic mRNAs and six immune cell infiltration level. A significance difference was indicated when p < 0.05 and the mRNAs were confirmed as immunerelated mRNAs. Furthermore, we also performed Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) analysis based on tumor expression profile of GEO dataset by using "estimate" package to calculate immune score, then Spearman analyses were performed between immune score and immune-related genes to validate the correlation between immune score and immunerelated mRNAs. A significance difference was indicated when p < 0.05. Correlations between expression level of immune-related mRNAs and biomarkers of immune cells were also revealed. The biomarkers were obtained from previous study [24] and CellMarker database (http:// biocc. hrbmu. edu. cn/ CellM arker/ index. jsp) [25].

Gene set enrichment analysis
To reveal the potential function of immune-related genes in ESCC tumor, we carried out Gene set enrichment analysis (GSEA) based on each immune-related mRNA by software GSEA (ver. 4.1.0). The tumor expression profile was divided into high expression groups and low expression group based on the median expression level of the aimed mRNA each time when a GSEA was performed. GSEA was conducted based on the high and low express groups. A significant functional enrichment was indicated if p < 0.05.

Pan-cancer analysis
In order to search out the role of the immune-related prognostic genes found in the ESCC in other cancers, a pan-cancer analysis was conducted, including gene differential analysis, survival analysis and immune cell infiltration analysis. Gene differential analysis and immune cell infiltration analysis were performed via TIMER database, and survival analysis was carried out by GEPIA database (http:// gepia. cancer-pku. cn/ index. html).

Gene expression profile data
The work flow of this study was shown in Fig. 1. The data of 215 samples in total were obtained from GEO database (98 normal tissues and 117 tumor tissues). After normalization, gene differential analysis was conducted. According to |LogFC|> 1 and adjusted p value < 0.05, in mRNA group, 1685 mRNAs were differentially expressed (176 were downregulated and 1509 were upregulated); in lncRNA group, only 3 lncRNAs (MCM3AP-AS1, HCP5 and GUSBP11) were differentially expressed and all upregulated ( Fig. 2A-D). The results of gene differential analysis were showed by volcano plots and the gene expression was showed by heatmaps.

ceRNA network
After prediction, 3 lncRNAs, 34 miRNAs and 169 mRNAs were included in the ceRNA network (Fig. 3A). In the ceRNA network, lncRNAs could regulate downstream mRNAs by regulating correlated miRNAs [26]. Three special networks were showed in the Fig. 3B. In these three networks, miRNAs only correlated with one lncRNA, and mRNAs only correlated with one miRNA.  These networks were unique, and showed special functions in ESCC.

Functional enrichment analysis and PPI network
Functional enrich analysis showed that mRNAs in ceRNA network were mainly enriched in mitotic cell cycle process, negative regulation of DNA-binding transcription factor activity, ossification, VEGFA-VEGFR2 signaling pathway, epithelial to mesenchymal transition, embryonic morphogenesis and so on (Fig. 4A). Some functions such as VEGFA-VEGFR2 signaling pathway, epithelial to mesenchymal transition, epithelial cell differentiation, regulation of cell development, regulation of cell adhesion may contribute to the tumorigenesis and metastasis of ESCC. Furthermore, enriched functions with a similarity > 0.3 were connected by edges ( Fig. 4B-C). PPI network showed the physical interactions between each mRNA in ceRNA network. The Molecular Complex Detection (MCODE) algorithm has been applied to identify densely connected network components ( Fig. 5A-B). Pathway and process enrichment analysis has been applied to each MCODE component independently, and the three best-scoring terms by p-value have been retained as the functional description of the corresponding components (Table 1).

KM survival curves from TCGA database
The data of 80 patients' tumor samples with clinicopathological features were obtained from TCGA database, of those 56 were still alive and 24 dead with a median survival time of 395.5 days. Survival analyses was performed and KM curves was built on the 3 lncRNAs and 169 mRNAs in ceRNA network. The results showed that all 3 lncRNAs were not prognostic while 9 upregulated mRNAs (POLR3D, GINS4, LMNB2, SLC7A6, PHTF2, RBL1, RNF2, IRAK1 and ZWINT) were prognostic ( Fig. 6A-I).

Correlation between immune cell infiltration level and prognostic mRNAs
Six immune cell infiltration level were acquired from TIMER database. Correlation analyses between six immune cell infiltration level and prognostic mRNAs were conducted. Results showed that all 9 mRNAs were correlated with somewhat infiltration level of the 6 immune cells (Fig. 7A-I). To validate these results, we also uploaded tumor expression profile of TCGA dataset to TIMER database, and the same analyses were performed. Results showed that only GINS4, PHTF2 and SLC7A6 were correlated with immune cell infiltration. GINS4 was correlated with the infiltration of B cell and myeloid dendritic cell; PHTF2 was correlated with the infiltration of macrophage; SLC7A6 was correlated with the infiltration of B cell, macrophage and myeloid dendritic cell ( Fig. 8A-C). Both datasets showed that GINS4 was correlated with B cell infiltration, and GEO dataset showed GINS4 was also correlated with the infiltration of neutrophil and T cell CD4+; TCGA dataset showed GINS4 was also correlated with the infiltration of myeloid dendritic cell. In GEO dataset, PHTF2 was correlated with the infiltration of all immune cells except B cell; however, in TCGA dataset, PHTF2 was only correlated with macrophage. In GEO dataset, SLC7A6 was correlated with the infiltration of all immune cells, but in TCGA dataset, it only correlated with infiltration of 3 immune cells. In conclusion, the above results indicated that GINS4, SLC7A6 and PHTF2 might be immunerelated genes. ESTIMATE analysis was conducted for validating the results, and correlation analyses between immune-related genes and immune score were performed. The results showed that only GINS4 was correlated with immune score (r = − 0.3, p = 0.001), while SLC7A6 (r = 0.170, p = 0.069) and PHTF2 (r = − 0.020, p = 0.831) were excluded. Therefore, GINS4 was confirmed as an immune-related gene.

Correlation of GINS4 expression with biomarkers of immune cells
To further explore the role of GINS4 in tumor immune, we determined the expression correlation of GINS4 with biomarkers of B cell, CD4 + T cell, neutrophil, dendritic cell and myeloid dendritic cell based on GEO datasets. As list in Table 2, GINS4 was significantly correlated with B cell's biomarker (CD19 and CD79A), one biomarker of dendritic cell (HLA-DQB1) and 5 biomarkers of myeloid dendritic cell (CD40, CD80, CD83, CD207 and CD209). These findings partially supported that GINS4 is positively correlated with immune cell infiltration.

Potential function of GINS4
The expression profiles of the tumors were divided into high expression and low expression group based on the median expression level of GINS4, and GSEA was carried out based on the two groups. The results showed that high expression group was mainly enriched in G Protein coupled glutamate receptor signaling pathway and fibroblast growth factor production; Low expression group was mainly enriched in negative regulation of monocyte chemotaxis, antigen processing and presentation of endogenous peptide antigen via MHC class I via ER pathway, positive regulation of antigen processing and presentation, dendritic cell antigen processing and presentation, and positive regulation of dendritic cell antigen processing and presentation (Fig. 9). These results demonstrated the role of GINS4 in ESCC tumor immune.

Pan-cancer analysis of GINS4
Gene differential analysis was performed in 33 types of cancer by TIMER database. The results showed that GINS4 was differentially expressed in bladder urothelial carcinoma (BLCA), breast invasive carcinoma    (Fig. 11). Immune cell infiltration analysis was based on the top 10 cancer in global morbidity via TIMER database. GINS4 was correlated with immune cell infiltration in all cancers, especially in BLCA, BRCA, COAD, LIHC, LUAD, prostate adenocarcinoma (PRAD), STAD, thyroid carcinoma (THCA) (Fig. 12). In conclusion, GINS4 might be a novel immune-related prognostic gene in ESCC and other cancers.

Discussion
In recent years, immunotherapy for esophageal cancer is developing rapidly, and has showed very promising results in the neoadjuvant and or adjuvant therapy for ESCC. Therefore, in order to predict the efficiency and prognosis of the esophageal cancer patients receiving immunotherapy, a lot of studies has been conducted on searching for immune-related genes. In this study, we systematically collected data from GEO database, and performed gene differential analysis. ceRNA network was constructed based on DEGs, and function enrichment   *** *** *** *** *** *** · *** *** *** *** *** *** *** *** · *** ***   and PPI network were performed based on DEGs in ceRNA network. Then, survival analysis and immune cell infiltration level analysis were carried out to search for immune-related prognostic genes. The results showed that GINS4 was immune-related prognostic gene and GSEA revealed its potential function. We also performed a pan-cancer analysis to investigate the role of GINS4 in other cancers. ceRNA networks link the function of protein-coding mRNAs with that of non-coding RNAs such as micro-RNA, long non-coding RNA, pseudogenic RNA and circular RNA. lncRNAs could competitively bind to the shared miRNAs, and their expression is then positively correlated. The upregulation of one lncRNA results in more sequestrated copies of shared miRNAs [26]. In our study, ceRNA network was constructed by DEGs, indicating that the regulation network plays a role in ESCC, especially the 3 unique networks in Fig. 2B.
PPI enrichment analysis revealed the physical interactions among DEGs in ceRNA network, and MCODE showed the potential protein complex and their functions. Some functions such as negative regulation of NF-kappaB transcription factor activity [27], TGF-beta signaling pathway [28] may contribute to tumorigenesis of ESCC.
Our study found that GINS4 might be a novel biomarker correlated with the prognosis not only in ESCC but also in many other cancers. Go, Ichi, Nii, and San (means five, one, two, and three, respectively, in Japanese) complex subunit 4 (GINS4) is a member of GINS family, which are essential for the initiation of DNA replication in yeast and Xenopus egg extracts [29]. Yang et al. reported that GINS4 was highly expressed in non-small cell lung cancer (NSCLC) and was associated with the prognosis of NSCLC, especially LUAD. They also found that overexpression of GINS4 promotes cancer cell growth, migration and invasion [30]. Zhu et al. found that GINS4 was highly expressed in gastric cancer and correlated closely with gastric cancer clinicopathological features such as OS and disease-free survival (DFS) [31]. Similar results were also found in colorectal cancer [32], pancreatic cancer [33] and hepatocellular carcinoma [34] and breast cancer [35].
In ceRNA network, GINS4 was regulated via HCP5/miR-17-5p axis. Histocompatibility leukocyte antigen complex P5 (HCP5) is a lncRNA located between the MICA (MHC class I polypeptide-related sequence A) and MICB (MHC class I polypeptide-related sequence B) genes in the MHC I region of chromosome 6p21.33, and it is mainly expressed in immune system. HCP5 was associated with tumorigenesis of many cancers such as pancreatic cancer, colorectal cancer, lung cancer and so on [36]. Thus, HCP5 may function by regulating GINS4.
Furthermore, our study also found that GINS4 was correlated with immune cell infiltration not only in ESCC but also in many other cancers. In ESCC, GINS4 was significantly associated with the infiltration level of B cell. It was also correlated with CD4+ T cell and myeloid dendritic cell according to GEO and TCGA dataset. GSEA showed GINS4 might have an important role in immune system. Therefore, GINS4 was confirmed as an immune-related prognostic gene in ESCC. However, no related researches have reported it until present.
The limitation of this study is that the GINS4 was confirmed as an immune-related prognostic gene in ESCC based on the analysis of the data downloaded from GEO and TCGA database. Further molecular biology experiments are required to investigate its function and regulation mechanism.

Conclusion
Our study found that GINS4 might be a novel immunerelated prognostic gene in ESCC. It is highly expressed in ESCC, and may be regulated via HCP5 / miR-17-5p axis. It may also play an important role in other cancers. Therefore, it could be a new target gene, which provides a new therapeutic target in many malignant tumors. Further studies are required to investigate its function.